OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Off-line isolated handwritten Thai OCR using island-based projection with n-gram model and hidden Markov models

Internal identifier: 001386 (Main/Exploration); previous: 001385; next: 001387

Authors: Thanaruk Theeramunkong [Thailand]; Chainat Wongtapan [Thailand]

Source: Information processing & management (ISSN 0306-4573), 2005

RBID : Pascal:05-0041913

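French descriptors: Traitement image; Reconnaissance caractère; Caractère manuscrit; Modèle Markov caché; Thaïlande; Multigramme

English descriptors: Character recognition; Hidden Markov model; Image processing; Manuscript character; Multigram; Thailand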

Abstract

Many traditional works on off-line Thai handwritten character recognition used a set of local features including circles, concavity, endpoints and lines to recognize hand-printed characters. However, in natural handwriting, these local features are often missing due to rough or quick writing, resulting in dramatic reduction of recognition accuracy. Instead of using such local features, this paper presents a method called multi-directional island-based projection to extract global features from handwritten characters. As the recognition model, two statistical approaches, namely interpolated n-gram model (n-gram) and hidden Markov model (HMM), are proposed. The experimental results indicate that the proposed scheme achieves high accuracy in the recognition of naturally-written Thai characters with numerous variations, compared to some common previous feature extraction techniques. Another experiment with English characters also displays quite promising results.


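Affiliations:

Thanaruk Theeramunkong: Information Technology Program, Sirindhorn International Institute of Technology, Thammasat University, Pathumthani 12121, Thailand
Chainat Wongtapan: Department of Computer Science, Faculty of Science, Payap University, Chiangmai, Thailand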


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Off-line isolated handwritten Thai OCR using island-based projection with n-gram model and hidden Markov models</title>
<author>
<name sortKey="Theeramunkong, Thanaruk" sort="Theeramunkong, Thanaruk" uniqKey="Theeramunkong T" first="Thanaruk" last="Theeramunkong">Thanaruk Theeramunkong</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Information Technology Program, Sirindhorn International Institute of Technology, Thammasat University</s1>
<s2>Pathumthani 12121</s2>
<s3>THA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Thaïlande</country>
<wicri:noRegion>Pathumthani 12121</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Wongtapan, Chainat" sort="Wongtapan, Chainat" uniqKey="Wongtapan C" first="Chainat" last="Wongtapan">Chainat Wongtapan</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>Department of Computer Science, Faculty of Science, Payap University</s1>
<s2>Chiangmai</s2>
<s3>THA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Thaïlande</country>
<wicri:noRegion>Chiangmai</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">05-0041913</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0041913 INIST</idno>
<idno type="RBID">Pascal:05-0041913</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000491</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000298</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000394</idno>
<idno type="wicri:doubleKey">0306-4573:2005:Theeramunkong T:off:line:isolated</idno>
<idno type="wicri:Area/Main/Merge">001424</idno>
<idno type="wicri:Area/Main/Curation">001386</idno>
<idno type="wicri:Area/Main/Exploration">001386</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Off-line isolated handwritten Thai OCR using island-based projection with n-gram model and hidden Markov models</title>
<author>
<name sortKey="Theeramunkong, Thanaruk" sort="Theeramunkong, Thanaruk" uniqKey="Theeramunkong T" first="Thanaruk" last="Theeramunkong">Thanaruk Theeramunkong</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Information Technology Program, Sirindhorn International Institute of Technology, Thammasat University</s1>
<s2>Pathumthani 12121</s2>
<s3>THA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Thaïlande</country>
<wicri:noRegion>Pathumthani 12121</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Wongtapan, Chainat" sort="Wongtapan, Chainat" uniqKey="Wongtapan C" first="Chainat" last="Wongtapan">Chainat Wongtapan</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>Department of Computer Science, Faculty of Science, Payap University</s1>
<s2>Chiangmai</s2>
<s3>THA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Thaïlande</country>
<wicri:noRegion>Chiangmai</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Information processing &amp; management</title>
<title level="j" type="abbreviated">Inf. process. manag.</title>
<idno type="ISSN">0306-4573</idno>
<imprint>
<date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Information processing &amp; management</title>
<title level="j" type="abbreviated">Inf. process. manag.</title>
<idno type="ISSN">0306-4573</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Hidden Markov model</term>
<term>Image processing</term>
<term>Manuscript character</term>
<term>Multigram</term>
<term>Thailand</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Traitement image</term>
<term>Reconnaissance caractère</term>
<term>Caractère manuscrit</term>
<term>Modèle Markov caché</term>
<term>Thaïlande</term>
<term>Multigramme</term>
</keywords>
<keywords scheme="Wicri" type="geographic" xml:lang="fr">
<term>Thaïlande</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Many traditional works on off-line Thai handwritten character recognition used a set of local features including circles, concavity, endpoints and lines to recognize hand-printed characters. However, in natural handwriting, these local features are often missing due to rough or quick writing, resulting in dramatic reduction of recognition accuracy. Instead of using such local features, this paper presents a method called multi-directional island-based projection to extract global features from handwritten characters. As the recognition model, two statistical approaches, namely interpolated n-gram model (n-gram) and hidden Markov model (HMM), are proposed. The experimental results indicate that the proposed scheme achieves high accuracy in the recognition of naturally-written Thai characters with numerous variations, compared to some common previous feature extraction techniques. Another experiment with English characters also displays quite promising results.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Thaïlande</li>
</country>
</list>
<tree>
<country name="Thaïlande">
<noRegion>
<name sortKey="Theeramunkong, Thanaruk" sort="Theeramunkong, Thanaruk" uniqKey="Theeramunkong T" first="Thanaruk" last="Theeramunkong">Thanaruk Theeramunkong</name>
</noRegion>
<name sortKey="Wongtapan, Chainat" sort="Wongtapan, Chainat" uniqKey="Wongtapan C" first="Chainat" last="Wongtapan">Chainat Wongtapan</name>
</country>
</tree>
</affiliations>
</record>
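
As an illustration of how a record like the one above might be read programmatically, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It is not part of Dilib. The function name `parse_record`, the `wrapper` element, and the two `urn:example:` namespace URIs are assumptions introduced only for this example: the record uses the `wicri:` and `inist:` prefixes without declaring them, so a strict parser needs some binding for them. Any bare `&` in the text must also be written as `&amp;` for the parse to succeed.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder namespace URIs: the record uses the wicri: and
# inist: prefixes without declaring them, so they are bound here only to
# make the text parseable.
WRAPPER_OPEN = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:inist="urn:example:inist">'
)
WRAPPER_CLOSE = "</wrapper>"


def parse_record(record_xml):
    """Return title, authors, and English keywords from one <record> string."""
    root = ET.fromstring(WRAPPER_OPEN + record_xml + WRAPPER_CLOSE)
    header = root.find("record/TEI/teiHeader")
    title = header.findtext("fileDesc/titleStmt/title")
    authors = [
        name.text for name in header.findall("fileDesc/titleStmt/author/name")
    ]
    keywords = [
        term.text
        for term in header.findall(".//keywords[@scheme='KwdEn']/term")
    ]
    return {"title": title, "authors": authors, "keywords": keywords}


# Abridged copy of the record above, used only to exercise the function.
SAMPLE = """<record><TEI><teiHeader>
<fileDesc><titleStmt>
<title xml:lang="en" level="a">Off-line isolated handwritten Thai OCR using island-based projection with n-gram model and hidden Markov models</title>
<author><name>Thanaruk Theeramunkong</name>
<affiliation wicri:level="1"><country>Thaïlande</country></affiliation></author>
<author><name>Chainat Wongtapan</name></author>
</titleStmt></fileDesc>
<profileDesc><textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Hidden Markov model</term>
</keywords>
</textClass></profileDesc>
</teiHeader></TEI></record>"""

info = parse_record(SAMPLE)
print(info["title"])
print(info["authors"])
print(info["keywords"])
```

The paths are anchored on `titleStmt` so that the title and authors repeated under `sourceDesc` are not collected twice.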

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001386 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001386 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:05-0041913
   |texte=   Off-line isolated handwritten Thai OCR using island-based projection with n-gram model and hidden Markov models
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024